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Abstract 

We have investigated simulation-based techniques for parameter estimation in chaotic intercellular networks. The proposed 
methodology combines a synchronization-based framework for parameter estimation in coupled chaotic systems with 
some state-of-the-art computational inference methods borrowed from the field of computational statistics. The first 
method is a stochastic optimization algorithm, known as accelerated random search method, and the other two techniques 
are based on approximate Bayesian computation. The latter is a general methodology for non-parametric inference that 
can be applied to practically any system of interest. The first method based on approximate Bayesian computation is a 
Markov Chain Monte Carlo scheme that generates a series of random parameter realizations for which a low synchronization 
error is guaranteed. We show that accurate parameter estimates can be obtained by averaging over these realizations. The 
second ABC-based technique is a Sequential Monte Carlo scheme. The algorithm generates a sequence of "populations", 
i.e., sets of randomly generated parameter values, where the members of a certain population attain a synchronization error 
that is lesser than the error attained by members of the previous population. Again, we show that accurate estimates can be 
obtained by averaging over the parameter values in the last population of the sequence. We have analysed how effective 
these methods are from a computational perspective. For the numerical simulations we have considered a network that 
consists of two modified repressilators with identical parameters, coupled by the fast diffusion of the autoinducer across the 
cell membranes. 
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Introduction 

Most dynamical systems studied in the physical, biological and 
social sciences that exhibit a rich dynamical behavior can be 
modeled by sets of nonlinear differential equations. These 
mathematical models are a useful tool to predict complex 
behaviors using numerical simulations. However, for the vast 
majority of systems, and particularly for biological systems, we lack 
a reliable description of the parameters of the model. In this paper 
we are interested in parameter estimation for coupled intercellular 
networks displaying chaotic behavior, since models of spontane- 
ously active neural circuits typically exhibit chaotic dynamics (for 
example, spiking models of spontaneous activity in cortical circuits 
[1-3] and the analogous spontaneously active firing-rate model 
networks [4,5]). 

The problem of parameter estimation can be tackled in different 
ways, e.g., using multiple shooting methods [6-8] or some 
statistical procedures based on time discretizations and other 
approximations [9-13]. These methods involve the solution of 
high-dimensional minimization problems, since not only the 
unknown parameters but also the initial values of the trajectory 
segments between the sampling times need to be estimated [7,14]. 



This is specially difficult when working with chaotic systems since 
very complicated error landscapes with many local minima can 
appear. In particular, notice that chaotic systems have an 
exponential sensitivity to initial conditions, that is, completely 
different trajectories can be obtained for identical parameter 
values and very similar initial conditions of the system variables. 

On the other hand, several authors have suggested to take 
advantage of synchronization techniques for coupled chaotic 
systems and turn them into accurate parameter estimation 
methods [14-29]. There is a variety of techniques that rely on 
the synchronization properties of chaotic systems in order to tackle 
the parameter estimation problem. For example, some authors 
have proposed to handle the parameters as additional variables 
whose dynamics is described by tailored differential equations 
designed to have a fixed point at the true parameter values [14— 
20] . Adaptive estimation techniques relying on the time discretiza- 
tion of the state trajectories [22,23,30] and Monte Carlo methods 
[29,31], including particle niters [32,33], have also been investi- 
gated. All these techniques address the estimation of the system 
parameters independently of the initial conditions of the dynamic 
variables. 
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In this paper we investigate techniques that combine a 
synchronization-based framework for parameter estimation in 
coupled chaotic systems with some state-of-the-art computational 
inference methods borrowed from the recent literature in 
computational statistics. In particular, we describe estimation 
methods based on the accelerated random search (ARS) optimi- 
zation algorithm [29,31,34] and two approximate Bayesian 
computation (ABC) [35-39] schemes. ABC is a general method- 
ology for non-parametric inference that can be applied to 
practically any system of interest. We investigate two ABC-based 
methods. The first one is a Markov Chain Monte Carlo (MCMC) 
scheme [37] that generates a series of random parameter 
realizations for which a low synchronization error is guaranteed. 
We show that accurate parameter estimates can be obtained by 
averaging over these realizations. The second ABC scheme is 
termed Sequential Monte Carlo (SMC) ABC [38]. It generates a 
sequence of "populations", i.e., sets of randomly generated 
parameter values, where the members of the n — th population 
attain a synchronization error that is lesser than the error attained 
by members of the (n—Y) — th population. Again, we show how 
very accurate estimates can be obtained by averaging over the 
parameter values in the last population of the sequence. For the 
numerical simulations we consider a network of two coupled 
repressilators, since the repressilator is a paradigmatic gene 
regulatory system. 

Methods 

Structure of the Model 

The repressilator is a prototype of a synthetic genetic clock built 
by three genes and the protein product of each gene represses the 
expression of another in a cyclic manner [40]. It can be 
constructed experimentally and produce near harmonic oscilla- 



tions in protein levels. In the original repressilator design [40], the 
gene lad expresses protein LacI, which inhibits transcription of the 
gene tetR. The product of the latter, TetR, inhibits transcription of 
the gene cl. Finally, the protein product CI of the gene cl inhibits 
expression of lad and completes the cycle. (See left-hand module 
in Fig. 1 of Ref. [41]). 

Cell-to-cell communication was introduced to the repressilator 
by Garcia-Ojalvo and coworkers [42] by introducing an additional 
feedback loop to the network scheme that is based on the quorum 
sensing mechanism. The additional genetic module, which might 
be placed on a separate plasmid involves two other proteins [42— 
44] : Luxl, which produces a small autoinducer (AI) molecule that 
can diffuse through the cell membrane, and LuxR, which 
responds to the autoinducer by activating transcription of a 
second copy of the repressilator gene lad. The additional quorum 
sensing feedback loop can be connected to the basic repressilator 
in such a way that it reinforces the oscillations of the repressilator 
or competes with the overall negative feedback of the basic 
repressilator. The first one leads to phase attractive coupling for a 
robust synchronised oscillation [42] whereas the latter one evokes 
phase-repulsive influence [45-47], which is the key to multi- 
stability and a very rich dynamics including chaotic oscillations 
[41,48,49]. Placing the gene luxl under inhibitory control of the 
repressilator protein TetR (see Fig. 1 bottom in Ref. [41]) leads to 
the desired repressive and phase-repulsive coupling. We term this 
system modified repressilator. Phase repulsive coupling is common 
in several biological systems, e.g. in neural activity in the brain of 
songbirds [50], in the respiratory system [51], in the jamming 
avoidance response in electrical fish [52] and in the morphogenesis 
in Hydra regeneration and animal coat pattern formation [53]. 

In particular, in this work we are going to consider two modified 
repressilators with identical parameters, coupled by the fast 
diffusion of the autoinducer (AI) across the cell membranes. The 
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Figure 2. Synchronization between the primary and secondary systems, (left) Synchronization error, |Si— fli|, between both systems for 
D = 20. (right) The first variable of the secondary system (_>>i =a\) versus the first variable of the primary system [x\ =a{). 
doi:10.1371/journal.pone.0079892.g002 



mRNA dynamics is described by the following Hill-type kinetics 
with Hill coefficient n: 



b - = - b ' + Y^' 



The third term on the right-hand side of Eq. (3) represents 
activated production of lad by the AI molecule, whose concen- 
tration inside cell i is denoted by Si. The dynamics of CI and Luxl 
can be considered identical, assuming equal lifetimes of the two 
proteins and given that their production is controlled by the same 
protein (TetR). Hence, the synthesis of the AI Si can be considered 
to be controlled by the concentration Bj of the protein CI. Taking 
also into account the intracellular degradation of the AI and its 
diffusion toward or from the intercellular space, the dynamics of 5, 
is given by 



<y- ko, 

Ci= - C ' + TT^ + TT^' (3) 

where the subscript i'=l,2 specifies the cell, and a,-, bi, and c,- 
represent the concentrations of mRNA molecules transcribed from 
the genes of MR, cl, and lad, respectively. The parameter a is the 
dimensionless transcription rate in the absence of a repressor. The 
parameter K is the maximum transcription rate of the LuxR 
promoter. 

The protein dynamics is given by 

Ai = fSM-Ai), (4) 
Bi = p h (bi-Bi), (5) 

C, = p c (c,-C,) (6) 

where variables Aj, Bj, and C, denote the concentration of the 
proteins TetR, CI, and LacI, respectively. The dynamics of the 
proteins is linked to the amount of the responsible mRNA, and the 
parameters P a j, c describe the ratio between mRNA and the 
protein lifetimes (inverse degradation rates). In what follows, we 
are going to assume Pt, = fl c - Thus, we can consider it as a single 
parameter. The model is made dimensionless by measuring time 
in units of the mRNA lifetime (assumed equal for all genes) and the 
mRNA and protein levels in units of their Michaelis constant. The 
mRNA concentrations are additionally rescaled by the ratio of 
their protein degradation and translation rates [42] . 



Si = - k s0 Si + k si Bi - r,(S t - S e ), (7) 

where the diffusion coefficient r\ depends on the permeability of 
the membrane to the AI. Because of the fast diffusion of the 
extracellular AI (S^j compared to the repressilator period, we can 
apply the quasi-steady-state approximation to the dynamics of the 
external AI [42], which leads to 



S L , = QS=Q±J2Si. (8) 
i=i 

The parameter Q_is defined as 

Q = (SN/ V ext )/(k se + SN/ Vex,) (9) 

where N = 2 is the number of cells, V ext is the total extracellular 
volume, k se is the extracellular AI degradation rate, and 5 is the 
product of the membrane permeability and the surface area. 

We achieve chaotic behavior for the following parameter values 
[41]: 2 = 0.85, n = 2.6, <x = 216, ft, = 0.85, 0 6 = O.l, r\ = 2, k = 25, 
At,so = 1, an d k s \ =0.01. In particular, these are the values we are 
considering throughout this manuscript in order to assess the 
parameter estimation algorithms. 

To see how the system behaves around these values we have 
plotted some bifurcation diagrams taking in each one a different 
parameter as a control parameter, whereas the rest of the 
parameters remain constant on the values mentioned above. 
Figure 1(A-I) represents the bifurcation diagrams versus the 
different control parameters. In particular, in each plot we are 
representing all maxima of the variable flj for some fixed initial 
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condition as a function of the corresponding control parameter. 
We can observe how for some values of the control parameters the 
behavior changes from periodic to chaotic and vice versa. The 
vertical red dashed lines correspond to the values we are 
considering to assess the parameter estimation algorithms. Notice 
how for these values the system can display chaotic behavior. 

Problem Statement 

We first introduce the notation to be used in the description of 
the parameter estimation methodologies. Let 



x = f(x,p) 



(10) 



represent the chaotic intercellular network (consisting of two 
coupled modified repressilators), with state variables 



x = Oi, . . . ,xu) 
= (.a\,b\,c\,A\,B\,C\,S\,a 2 ,b 2 ,c 2 ,A 2 ,B 2 ,C 2 ,S 2 ) 

and unknown parameters 

p = (p u . . . ,p 9 ) 

= (Q,P a ,n,a,P h ,K,k s0 ,k sl ,ri). 



(11) 



(12) 



The vector-valued function f can be explicitly written down by 
comparing Eq. (10) with Eqs. (l)-(7). In the sequel we refer to the 
system of Eq. (10) as primary system. 

Since f in Eq. ( 1 0) is known, we can build a model with identical 
functional form but adjustable parameters and a coupling term, 
termed secondary system in the sequel, as 



y = f(y,q) + Z>s, v (x-y), 



(13) 



where 



where £=1,2 denotes the cell number (i.e., the repressilator index). 
Thus, the coupling only appears in two of the fourteen differential 
equations we have in the secondary system. 

Since we assume identical functional form for the primary and 
secondary systems, when the secondary parameter vector, q, 
coincides with the primary parameter vector, p, the state variables 
y synchronize with x for D> D c , where D c is a coupling threshold 
[29]. On the contrary, if q^p then identical synchronization 
cannot occur. However, the difference ||y — x|| is expected to be 
small whenever the difference of the two parameter vectors is small 
and D is sufficientiy large. Therefore, the objective of a parameter 
estimation method based on synchronization is to compute a value 
q such that ||y — x[| «0 over time, since the latter implies q«p. 

Figure 2(left) shows how the synchronization error \a\— a\\ 
evolves in time when considering D = 20 and identical parameters 
for the primary and the secondary systems. We can see how this 
error can be very small, less that 10~ 10 . We are going to fix Z> = 20 
in all simulations throughout the paper. In Fig. 2(right) we are 
representing one variable of the secondary system {y\ =a\) versus 
the same variable of the primary system (x\ =ci\) and observing 
the typical straight line, characteristic of the synchronization 
phenomenon. 

It has been verified by means of numerical simulations that in 
order to obtain synchronization at least two coupling variables are 
needed (excluding S\ or S 2 ), one from each repressilator in the 
primary system. Therefore, other combinations different from 
(ai,a 2 ) are also possible, as for example (61,^2)- Coupling via a 
single variable is not sufficient to guarantee that the secondary 
system synchronizes with the primary one. 

Results 

Parameter Estimation Methods 

Here we introduce three parameter estimation methods that 
combine the synchronization-based framework described above 
with state-of-the-art computational methods. In particular, we 
present results for the joint estimation of four parameters, 



(14) 



= (ai M ,cj ,Ai ,B X ,C\,S\ ,a 2 ,b 2 ,c 2 ,A 2 ,B 2 ,C 2 ,S 2 ) 
is the time-varying vector that contains the model state variables, 



q = (?i,...,?9) 



(15) 



is the adjustable parameter vector, D is a coupling coefficient and 
Sjj : R U —>M 14 is a vector function that selects the i-th and the j-th 
element of its argument, i.e., 



s y (x-y) = (0,...,0,x i -y i ,0,...,0,x J -y j ,0,...,0). (16) 

The definition of the latter function implies that coupling is 
performed only through two scalar time series, X; and Xj, from the 
primary system. In particular, the coupling scheme we have 
chosen for the simulation setup in this paper is 



-«; + 



1 + Cf 



+ £>(«,- -a,), 



P = (P\ ,P2,P3,P4) = (Q,P a ,n,Cc), 



(18) 



in the coupled modified repressilator network. In all simulations 
shown in this manuscript we have assumed that just two scalar 
signals from the primary system are observed, namely the coupling 
variables (a\,a 2 ), and we have numerically integrated the 
secondary system using a second order Runge-Kutta method with 
a time step T s = 0.005 time units (t.u.). 

Accelerated Random Search 

We first focus on a parameter estimation method for chaotic 
intercellular networks that takes advantage of chaos synchroniza- 
tion and is based on an efficient Monte Carlo optimization 
procedure, known as accelerated random search (ARS) method 
[29,31]. In particular, the method we are going to describe consists 
in a variation of the technique proposed in [3 1] . Assuming that the 
variables (a\,a 2 ) are observed during the time interval (0,T o ), the 
value of the parameters in the secondary system can be calculated 
as the solution to the optimization problem 



q = argmin{/(q)}, 
q 



(19) 



(17) where q = (qi, . . . ,q<x) = (Q,P a ,n,a) and the cost function 
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2. Choose an index je{l, . . . ,m}, define 



0 

JV— 1 



(\ai -ai\ 2 + \a 2 -a 2 \ 2 



(20) 



N 



(iai(nT s )-a l (nT s )) 2 + (a 2 (nT s )-a 2 (nT s )) 2 ^ 



Jj,n-\{q) = 

J(qi(n-l), . . . ,qj-i(n- \),q,q j+ i(n-Y), ■ ■ ■ ,q m (n-l)) 



(22) 



is a quantitative representation of the synchronization error 
between the primary and the secondary systems. Notice that 
N=T 0 /T S , where T 0 = 10 s t.u. 

The method can be outlined as follows: 

1. Initialization. Choose: 

(a) initial parameter values 



q(0) = (? 1 (0),? 2 (0) > ...,? m (0)), 



(21) 



where m represents the number of parameters we want to 
estimate; 

(b) maximum and minumum "search variances" for each 
parameter, oj max and oj mjn , respectively, where 7 = 1,... ,m; 

(c) a "contraction factor" for each parameter, Cj>l, 
j=l,...,m. 

Set °f(l) = °f max for j=l, . . . ,m and, using the initial 
parameter vector, q(0), evaluate the associated cost /(q(0)). 

2. Iterative step. Let N(x\)i,a 2 ) denote the Gaussian probability 
distribution of the variable x with mean [i and variance a 2 . 
Assume we have computed a vector of parameter estimates, 
q(n— l) = (<?i(w— 1), . . . ,q m {n— 1)), with cost J(q(n— 1)). 

Choose an index je{\, . . . ,m}, define 



Jj,n-\(q) = J(qM- !)> • • ■ 

Mj+\(n-y),. ..,q„,(n-l)) 



(22) 



and set Oy(0) = a 2 max and qj(0) = c/j(n— 1). Choose an integer L 
(the number of iterations to be performed over each parameter) 
and carry out the following computations: 
For 1=1,..., L: 

The method can be outlined as follows: 
1 . Initialization. Choose: 
(a) initial parameter values 



q(0) = (^(0),ft(0),...^ m (0)), 



(21) 



where m represents the number of parameters we want to 
estimate; 

(b) maximum and minumum "search variances" for each 
parameter, ffj max and oj mj „, respectively, where j= 1, . . . ,m; 

(c) a "contraction factor" for each parameter, Cj>l, 
j=\,...,m. 

Set G 2 W = a j max for j=\,...,m and, using the initial 
parameter vector, q(0), evaluate the associated cost /(q(0)). 

2. Iterative step. Let N(x\fi,a 2 ) denote the Gaussian probability 
distribution of the variable x with mean /i and variance a 2 . 
Assume we have computed a vector of parameter estimates, 
q(«— l) = (qi(n— 1), . . . ,q m (n— 1)), with cost J(q(« — 1)). 



and set 0^(0) = aj max and <7y(0) = qj(n — 1). Choose an integer L 
(the number of iterations to be performed over each parameter) 
and carry out the following computations: 

For 1=1,..., L: 

(a) Draw q~N(q\qj(l- 

(b) Compute Jj^ n -\(q). 

(c) If J J , n - 1 (q)<Jj,n-l(q J (l-V) then 



■!),<#- 1)). 



else 



o 2 {l) = o 2 , 



q J (l) = qj(l-l), 



a}(D = a}(l-l)/Cj 



(23) 



(d) If ^(/K^, then 



(24) 



(25) 



Once the loop is completed, set qj{ri) = qj{L) and 
qk(n) = qk(n — 1) for every k # j. 

Let j = (/ +!) m °d m , set cj(0) = cr 2 max and perform the loop 
over (a)-(d) again. 

The algorithm can be stopped when the synchronization error 
J(q(n)) reaches a certain threshold or after a prescribed number of 
steps n (e.g., when n = n 0 for some sufficiently large n 0 ). Notice that 
the total number of iterations is n B L and the time evolution of the 
secondary system state, y, has to be integrated at each iteration, 
since a new candidate parameter vector is drawn each time. 

We have carried out numerical simulations where this Monte 
Carlo optimization algorithm has been iterated a total of 8 x 10 4 
times, with the number of consecutive iterations for each 
parameter (iterations per loop) set to L= 100, and the secondary 
system has been integrated for each iteration during T 0 = 1 x 10 s 
time units. 

The initial values of the parameters are sampled from a uniform 
distribution, namely 



e(0),/3 a (0),«(0)~t/(0,2); 



a(0)~ [7(100,500), 



(26) 
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the maximum search variances for each parameter are 



£ e = 2.7x 10- 8 ,£-8 =8.7 x 1(T 7 , 



2 _ 2 2 _9 

Q,max p a ,max n,max > 



£'„ = 1.6x 10- 8 ,£- a = l.l x 10" 



(31) 



<™« = 500, 



(27) 



the minimum search variances are 

4,™ = ^L™" = = 0-0001 , 

<™„=o.oi, 

and the contraction factors are 

c e = c A, =c„ = 2.5, 
Co, = 25. 



(28) 



(29) 



The normalized quadratic error for each parameter q,, 
1 = 1,... ,4, is defined as, 



E Pi (n) = {(pi-q i {n))/p ii 



(30) 



Figure 3(A-D) shows the normalized quadratic errors for each 
parameter as a function of the number of iterations (mL). We can 
see how after 3 x 10 4 iterations all errors are very low for the 
estimated parameters. The values of the normalized quadratic 
errors in the 30,000-th iteration are: 



Approximate Bayesian Computation 

In Bayesian estimation theory the parameters are modeled as 
random variables, rather than unknown but deterministic num- 
bers. Consequently, ABC-based methods aim at approximating 
the probability distribution of the parameters, vector q, conditional 
on the observations from the primary system. The technique, to be 
described next, is nonparametric, i.e., the distribution of q is 
approximated by a set of random samples. 

ABC methods have been conceived with the aim of inferring 
posterior distributions without having to compute likelihood 
functions [35-39] . The calculation of the likelihood is replaced 
by a comparison between the observed and the simulated data. In 
the setup of this paper, the comparison is carried out between the 
data from the primary system (the observations) and the data 
generated by the secondary system (the model with adjustable 
parameters). The comparison between these data represents, in 
our case, a measure of the synchronization error between these 
two systems. 

Let 71 denote the a priori probability density function (pdf) of the 
random parameter vector q, let /Xy|q) denote the probability 
distribution of the data y generated by the secondary system 
conditional on the realization of q and let <i(x,y) be a distance 
function that measures the synchronization error by comparing 
the observed time series x from the primary system and the 
generated time series y from the secondary one. Since the system 
of interest is deterministic, /(y|q) collapses into a Dirac delta 
measure when q is given. In the ABC framework, though, we are 
not interested in evaluating / but rather in generating y using the 
model conditional on q. This amounts to integrating the secondary 
system with a given realization of the adjustable parameters q. 

The simplest ABC algorithm is the ABC rejection sampler [35] . For 
a given synchronization error threshold (often termed tolerance) 
>0, the algorithm can be described as follows. 




x 10 



x 10 



Figure 3. Normalized quadratic errors for each parameter as a function of the number of iterations for the ARS algorithm. (A) Q, (B) 

P a , (C ) n and (D) a. 

doi:1 0.1 371 /journal.pone.0079892.g003 
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1. Sample q from 7t(q). 

2. Simulate a dataset y from the conditional probability 
distribution /(y|q). 

3. If the distance function (synchronization error) is <i(x,y)<e, 
then accept q, otherwise reject it. 

4. Return to the first step. 

The output of an ABC algorithm is a population of parameter 
values randomly drawn from the distribution 7i(q|c/(x,y)<), i.e., a 
distribution with density proportional to the prior but restricted to 
the set of values of q for which the synchronization error is at most 

The disadvantage of the ABC rejection sampler is that the 
acceptance rate is very low when the set of values of q for which 
the synchronization error is less than turns out to be relatively 
small. Thus, instead of implementing this method we have decided 
to implement two more sophisticated ABC algorithms. The first 
one is a Markov chain Monte Carlo (ABC MCMC) technique and 
the second one is a sequential Monte Carlo (ABC SMC) method. 

Approximate Bayesian Computation Markov Chain 
Monte Carlo 

The Approximate Bayesian Computation Markov Chain 
Monte Carlo (ABC MCMC) algorithm is a Metropolis-Hastings 
[37] MCMC method that incorporates one additional test to 
ensure that all parameters in the chain yield a synchronization 
error below the threshold . It can be outlined as follows. 

1. Initialize q(i'), i = 0. 

2. Generate q according to a proposal distribution g(q|q(0) 

3. Simulate a dataset y from the conditional probability 
distribution /(y|q). 

4. If the distance function (synchronization error) is c/(x,y)<£, go 
to the following step, otherwise set q(i + l) = q(/) and go to step 
6. 

5. Set q(/+ l) = q with probability 



£ = min 1 , 



rc(q)g(q(01q) \ (32) 
Vq(Ote(qlq(W 1 ' 

and q(/+ 1) = q(0 with probability 1 — £. 

1. Set i' = /+l, go to step 2. 

The outcome of this algorithm is a Markov chain with the 
stationary distribution 7t(q|c/(x,y)<f) [37]. The parameters are 
assumed independent a priori, hence 



7t(q) : 



j= l,,..,m 



(33) 



and we also choose independent proposals for simplicity. In 
particular, the prior distributions for each parameter have been 
chosen as 



n(Q) = n(fi a ) = K(n)=U(0,5) 



n(a)= £7(0,500), 



(34) 



and the proposal distribution for each parameter is a Gaussian 
distribution centered at the previous value of the corresponding 
parameter and with a fixed standard deviation, different for each 
parameter. In particular, 



<j(Q) = o(fi a ) = o(n)=\ 




80 200 220 240 260 



Figure 4. Histograms of the approximate marginal posterior distributions for each parameter for the ABC MCMC algorithm when 
considering e= 0.001. (A) Q, (B) f} a , (C ) n and (D) a. 
doi:1 0.1 371 /journal.pone.0079892.g004 
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ff(a)=100 



(35) 



and 



g(qlq)= 



n N(qj\q a( g/ )). 



(36) 



Since the proposal distribution is symmetric, g(q,-|q) = g(q|q,), and 
the prior is uniform, the acceptance probability in Eq.(32) is £= 1. 
The distance function (synchronization error) has been chosen 



that i > ... > t > 0, thus the empirical distributions gradually 
evolve towards the target posterior. The ABC SMC algorithm 
proceeds as follows [39]. 

1. Initialize fi > ... >er- Set the population indicator t = 0. 

2. Set the particle indicator ;'= 1. 

3. If t = 0, draw q* from the prior rc(q). 

Else, draw q* from 



g 1 (q)=J2n^{t-l)K 1 (q\q ii) (t-r)) 



(40) 



c?(y,x) = 



■T 0 
0 

N-l 



a\ ~ai\ 2 + \a2—a 2 \ 2 ^dt 



(37) 



((ai(nT s )-ai(nT s )) 2 + (a 2 (nT s )-a 2 (nT s j) 2 ^ 



where T a = 10 5 t.u. and NT S = Tq. This expression is equivalent to 
the cost function in the ARS method. The tolerance (threshold for 
the synchronization error) has been chosen as e = 0.001. 

A chain of 5 x 1 0 5 samples has been generated, what implies 
that the secondary system has been integrated 5 x 10 5 times. The 
initial point of the chain is selected to ensure that the associated 
distance is less than 0.05. Figure 4(A-D) shows the histograms of 
the approximate marginal posterior distributions for each param- 
eter. In order to reduce the strong correlation between consecutive 
samples in the Markov chain we have subsampled by a factor of 
50. We have calculated the mean values of each histogram as well 
as the normalized quadratic errors according to the following 
expression 



E Pi = ((Pi-qdlPi) 



(38) 



where <7,-, for i=l,...,4, represents the mean value of the 
histogram of the corresponding parameter. The values of the 
normalized quadratic errors for the estimated parameters are 



where K t (.\c() is a symmetric kernel centred around q' and 
w^ l \t—l),...,w (N \t—l) are importance weights such that 
£f =1 H%-D=l. 

4. If 7t(q*) = 0, return to step 3. 

5. Simulate a candidate dataset y/(y|q*). 

6. If d(x,y)>e, return to step 3. 

7. Set q*'*(?) = q* and calculate the weight, 



.,(0 . 



Uf 



N 

J2 ._ j ».>M((-U^(q«('-l)),q«(0) 



t = Q 
t>0 



(41) 



8. If i< N, set i = i+ 1 and go to step 3. 

9. Normalize the weights. 

10. If t< T, set t=t+l and go to step 2. Otherwise stop. 

The prior distributions we have considered for each parameter 
are n(Q) = n(fi a ) = n(n)=U(0,S) and 71(a) = [/(0,500), the same 
as for the ABC MCMC algorithm. The perturbation kernel K t is 
Gaussian, namely 



:1.3xl0- 4 ^„ 



= 5.9x 10" 



1.9x 10" 



= 5.7 x 10~ 



(39) 



We can see how three parameters are accurately estimated 
whereas for one of them, j8 a the error is significantly higher 
compared to (31). 

Approximate Bayesian Computation Sequential Monte 
Carlo 

A more sophisticated application of the ABC methodology is the 
Approximate Bayesian Computation Sequential Monte Carlo 
algorithm [39,54,55]. In ABC SMC, a number of sampled 
parameter values (often termed particles), {q' 1 ', . . . ,H (N ' 1 }, drawn 
from the prior distribution 7i(q), are propagated through a 
sequence of intermediate distributions n(q\d(x,y) < ,), 
i=l, . . . ,T — 1, until they are converted into samples from the 
target distribution 7t(q|c/(x,y) < 7-). The tolerances are chosen such 



W^-l))= nAf(qJ\qf(t-l),c(qj)), (42) 

with standard deviations G(Q) = o(fi a ) = o{n) = §.\ and o(a) = 10. 
The distance function (synchronization error) is the same as for the 
ABC MCMC and ARS algorithms with the same T 0 value. To 
ensure the gradual transition between populations, the ABC SMC 
algorithm is run for T= 12 populations with ={1,0.5,0.1,0.05, 
0.025,0.01,0.005,0.0025,0.001,0.0005,0.00025,0.0001} and we 
have considered N = 400 particles per population. 

Figure 5(A— D) shows the histograms of the approximate 
marginal posterior distributions for each parameter where the 
different weights of the different particles have been taken into 
account for the representation. We have calculated the mean 
values of each histogram (that match the actual values) as well as 
the normalized quadratic errors using Eq. (38), that is, the same 
expression as for the ABC MCMC algorithm. The values of the 
normalized quadratic errors for the estimated parameters with the 
ABC SMC method are 
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Figure 5. Histograms of the approximate marginal posterior distributions for each parameter for the ABC SMC algorithm when 
considering e= 0.0001. (A) Q, (B) )i a , (C ) n and (D) a. 
doi:1 0.1 371 /journal.pone.0079892.g005 



Eq = 5Ax Kr 7 ,£- ft/ =9.4x 1(T 4 , 



= 2.5 x 10- 



= 3.2x 10- 



Figure 6(A-D) shows the output (i.e. the accepted particles) of 
the ABC SMC algorithm as scatterplots of some of the two- 
dimensional parameter combinations, where we have information 
(43) of different populations in the same plot. As we iterate the 
algorithm we obtain populations that are more dense around the 
desired values, as shown in these plots, where the particles from 
the prior are represented in blue, particles from population 2 in 
green, particles from population 4 in light blue, population 6 in 
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Figure 6. Two-dimensional scatterplots for the accepted particles of the ABC SMC of each population. The particles from the prior are 
represented in blue, particles from population 2 in green, particles from population 4 in cyan, population 6 in pink, 8 in yellow, 10 in red and particles 
from the posterior (population 12) in black color. 
doi:1 0.1 371 /journal.pone.0079892.g006 
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Figure 7. Box plots diagrams for the different populations for 
each parameter. (A) Q, (B) [}„, (C ) n and (D) a. 

doi:1 0.1 371 /journal.pone.0079892.g007 



pink, 8 in yellow, 10 in red and particles from the posterior 
(population 12) in black color. We can see how for the last 
population the particles are tightly clustered around the desired 
value. 

In order to gain insight of how the parameters are estimated 
during the evolution of the algorithm, we have represented some 
box-plot diagrams, one for each parameter to be estimated, as 
seen in fig. 7(A-D). In each diagram, we have information about 
the corresponding parameter as a function of the population 
index. In particular, the central mark of each box is the median of 
the population, the edges of the box are the 25-th and 75-th 
percentiles, the whiskers extend to the most extreme data points 
not considered as oudiers, and the outliers are plotted individually 
using the plus symbols in red. The horizontal lines in red represent 
the actual values of the parameters we have used in our simulation. 
We can see from these plots how for a high enough population 
index the median values of the four parameters perfecdy match 
the actual values. 
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Figure 8. Computational cost for each population using the 
ABC SMC algorithm. 

doi:1 0.1 371 /journal.pone.0079892.g008 
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Figure 9. Computational complexity measured by the number 
of random samples generated by the algorithms. The solid blue 
line is the complexity of the ABC MCMC algorithm (with = 0.001). The 
solid red line is the complexity of the ARS method. The black dots 
indicate the complexity of the ABC SMC algorithm, for each population 
up to the 12 — th one. 
doi:10.1371/journal.pone.0079892.g009 



We have also studied the computational cost of the algorithm. 
In fig. 8 we can see the number of samples or particles we have 
generated in order to have N = 400 accepted particles for each 
population. We can see how the number of particles increases with 
the population index, being significantly high for the last 
population index, since it corresponds to a very small value of 
the synchronization error. Notice that the vertical axis of this 
figure is in a logarithmic scale. 

Comparison of the Methods 

Here we compare not only the accuracy but also the 
computational complexity of all three Monte Carlo methods for 
the joint estimation of the four parameters, Q, \i a , n and 0£, of the 
chaotic intercellular network. To do that, we have calculated for 
the ABC SMC algorithm the computational load up to each 
population. Specifically, the computational complexity of gener- 
ating a sequence of m populations is given by the number of 
samples or particles that have to be generated before completing 
the m — th population. Note that the computational load for the 
m — th population also includes all samples needed to generate the 
m — 1 previous populations. 

Fig. 9 provides a graphical depiction of the complexity of the 
three methods, that we have investigated (ARS, ABC MCMC and 
ABC SMC algorithms). The line in red represents the complexity 
for the ARS method and the line in blue indicates the complexity 
for the ABC MCMC technique. 

The parameter estimation errors attained with the ARS method 
are of the same order as the errors of the parameter estimates 
computed from the 12 — th population of the ABC SMC 
algorithm. However, the number of samples generated to run 
the ARS procedure is x3 x 10 4 while the ABC SMC technique 
demands the generation of « 4 x 10 s random samples up to the 
12 — //? population. The ABC MCMC method achieves the 
poorest performance, as it requires the generation of the highest 
number of samples («5 x 10 5 ) and produces the largest errors (up 
to three orders of magnitude worse than the ARS or ABC SMC 
estimates). 
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Discussion 

We have investigated three computational inference techniques 
for parameter estimation in a chaotic intercellular network that 
consists of two coupled modified repressilators. The proposed 
methodology combines a synchronization-based framework for 
parameter estimation in coupled chaotic systems with some state- 
of-the-art computational inference methods borrowed from 
computational statistics. In particular, we have focussed on an 
accelerated random search algorithm and two approximate 
Bayesian computation schemes (ABC MCMC and ABC SMC). 
The three methods exploit the synchronization property of chaotic 
systems. Therefore, it is not necessary to estimate the initial 
conditions of the variables, which is an important advantage from 
a computational point of view. 

We have carried out the numerical study in this paper assuming 
that only two variables from the primary system can be observed. 
This is the minimum number of observed variables in order to 
guarantee the synchronization of the secondary system. If 
additional variables can be observed it is possible to easily 
incorporate them into the proposal methodology. For example, if 
the variables x\, . . . ,xg are observed (this is the full state of the first 
repressilator and the first variable, x$, of the second repressilator) 
we can redefine the distance function of Eq. (38) as 



%x)*i^ Y,(x i (nT s )-y,(nT s )) 2 . (44) 

n=0 i=\ 

It can be verified (numerically) that using the distance in (44) 
(which intuitively provides "more information" about the primary 
system) leads to more accurate parameter estimates (or, alterna- 
tively, a greater number of parameters can be estimated if 
necessary). Note, however, that this comes at the expense of an 
additional computational effort and, moreover, it is unclear that all 
these variables can be accurately measured in practice. 

The proposed methods can be applied when the observed time 
series are contaminated with additive noise of moderate variance. 
For example, if the observations have the form 
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